Part I: Prediction
Estimate the value of sales at an unobserved level of TV investment.
Part II: Causal Inference
How much does TV investment affect sales?
Read (using read.csv) the Advertising.csv data set (from webpage > Data Sets)
Plot the data (using the plot function). Add the previously calculated point (using the points function)
How much do sales increase when I spend one more unit on TV advertising along this line?
linear_model <- lm(sales ~ TV, ads)
summary(linear_model)
##
## Call:
## lm(formula = sales ~ TV, data = ads)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.3860 -1.9545 -0.1913 2.0671 7.2124
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.032594 0.457843 15.36 <2e-16 ***
## TV 0.047537 0.002691 17.67 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.259 on 198 degrees of freedom
## Multiple R-squared: 0.6119, Adjusted R-squared: 0.6099
## F-statistic: 312.1 on 1 and 198 DF, p-value: < 2.2e-16
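Reading the Estimate column of the output above, the fitted line can be written as

\[\widehat{\mbox{sales}} = 7.032594 + 0.047537 \times \mbox{TV}\]

so, under this model, each additional unit of TV investment is associated with roughly 0.0475 more units of sales on average.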
What do you think the Residuals may mean?
What is the Std. Error? How can we increase/decrease it?
What is the predicted value at \(TV=44.5\)?
df <- data.frame(TV=44.5)
predict(linear_model, df)
Exercise: Calculate the Mean Squared Error
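One possible way to approach the exercise is sketched below. To stay self-contained it does not read Advertising.csv; instead it assumes a small data frame, here called ads_head (a hypothetical name), holding only the six rows printed in these slides. The resulting number is therefore illustrative and will differ from the value obtained on all 200 observations.

```r
# Illustrative subset: the six rows of Advertising.csv printed in these slides.
# ads_head is a hypothetical name; the slides use the full data set `ads`.
ads_head <- data.frame(
  TV    = c(230.1, 44.5, 17.2, 151.5, 180.8, 8.7),
  sales = c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)
)

# Fit the simple regression on the subset.
fit <- lm(sales ~ TV, data = ads_head)

# Mean squared error: the average of the squared differences
# between the observed sales and the fitted values.
y_hat <- predict(fit, ads_head)
mse <- mean((ads_head$sales - y_hat)^2)
mse
```

With the full data, the same two lines apply after replacing ads_head by ads and fit by linear_model.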
Given a data set \(\{x_i, y_i\}\) we want to find the line \(y = a + b x\) that minimizes the sum of squared errors
\[S(a, b) = (y_1 - (a + b x_1))^2 + \dots + (y_n - (a + b x_n))^2 = \sum_i (y_i - a - b x_i)^2\]
Question: In the formula above, which are the unknowns?
\[\min_{a, b} \sum_i (y_i - a - b x_i)^2 \]
Consider \(\bar x = \mbox{mean}(x)\) and \(\bar y = \mbox{mean}(y)\):
\[\hat b = \frac{Cov(x, y)}{Var(x)} = \frac{\sum_i (x_i - \bar x)(y_i - \bar y) }{\sum_i (x_i-\bar x)^2} \]
\[\hat{a} = \bar y - \hat b \bar x\]
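A brief sketch of where these two formulas come from: at the minimum, both partial derivatives of \(S(a, b) = \sum_i (y_i - a - b x_i)^2\) vanish,

\[\frac{\partial S}{\partial a} = -2 \sum_i (y_i - a - b x_i) = 0, \qquad \frac{\partial S}{\partial b} = -2 \sum_i x_i (y_i - a - b x_i) = 0\]

Dividing the first equation by \(n\) gives \(\hat a = \bar y - \hat b \bar x\). Substituting this into the second equation gives

\[\sum_i x_i (y_i - \bar y) = \hat b \sum_i x_i (x_i - \bar x)\]

and, since \(\sum_i \bar x (y_i - \bar y) = 0\) and \(\sum_i \bar x (x_i - \bar x) = 0\), both sums can be centered, which yields the expression for \(\hat b\) above.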
\[r_{xy} = \hat b \frac{\sqrt{Var(x)}}{\sqrt{Var(y)}} = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_i (x_i - \bar x)^2 }\sqrt{\sum_i (y_i - \bar y)^2 }}\] where \(-1\leq r_{xy} \leq 1\).
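The closed-form expressions above can be checked numerically. The sketch below assumes two short vectors, x and y, filled with the six TV and sales values printed in these slides (an illustrative subset, not the full data set), and compares the hand-computed values with what lm and cor return.

```r
# Illustrative subset: six (TV, sales) pairs printed in these slides.
x <- c(230.1, 44.5, 17.2, 151.5, 180.8, 8.7)   # TV
y <- c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)       # sales

# Slope and intercept from the closed-form least squares solution.
b_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_hat <- mean(y) - b_hat * mean(x)

# Correlation written as the slope rescaled by the two standard deviations.
r_xy <- b_hat * sd(x) / sd(y)

c(a_hat = a_hat, b_hat = b_hat, r_xy = r_xy)

# Reference values from R's built-in functions.
coef(lm(y ~ x))
cor(x, y)
```

Because sd uses the same \(n - 1\) denominator for both variables, the normalizing constants cancel in the ratio, so r_xy agrees with cor(x, y).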
What if we want to take into account more information?
radio advertisement?
newspapers advertisement?
Question: What advantages do you expect from adding more variables?
\[\mbox{sales} = \alpha_0 + \alpha_1 \times \mbox{TV} + \alpha_2 \times \mbox{radio} + \alpha_3 \times \mbox{ newspapers}\]
Covariates X (matrix)
head(ads[, 2:4])
## TV radio newspaper
## 1 230.1 37.8 69.2
## 2 44.5 39.3 45.1
## 3 17.2 45.9 69.3
## 4 151.5 41.3 58.5
## 5 180.8 10.8 58.4
## 6 8.7 48.9 75.0
Response Y
head(ads[, 'sales', drop=FALSE])
## sales
## 1 22.1
## 2 10.4
## 3 9.3
## 4 18.5
## 5 12.9
## 6 7.2
\[\left[ \begin{array}{cccc} \mbox{intercept} & \mbox{TV} & \mbox{radio} & \mbox{newspapers} \\ 1 & 230.1 & 37.8 & 69.2 \\ 1 & 44.5 & 39.3 & 45.1 \\ ... & ... & ... & ... \end{array} \right] \left[ \begin{array}{c} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{array}\right] = \left[ \begin{array}{c} \mbox{sales} \\ 22.1 \\ 10.4 \\ ... \end{array}\right]\]
or equivalently
\[X \alpha = Y\]
\[\min_{\alpha_0, \alpha_1, \alpha_2, \alpha_3} \sum_i (y_i - (\alpha_0 + \alpha_1 \times \mbox{TV}_i + \alpha_2 \times \mbox{radio}_i + \alpha_3 \times \mbox{ newspapers}_i))^2 \]
\[\hat \alpha = (X^tX)^{-1}X^t Y\]
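The matrix formula can be tried directly in R. The sketch below assumes a small data frame, ads_head (a hypothetical name), containing only the six rows printed in these slides, builds the design matrix X with a column of ones for the intercept, and compares the result with lm. The numbers are illustrative and will not match a fit on the full data set.

```r
# Illustrative subset: the six rows of Advertising.csv printed in these slides.
ads_head <- data.frame(
  TV        = c(230.1, 44.5, 17.2, 151.5, 180.8, 8.7),
  radio     = c(37.8, 39.3, 45.9, 41.3, 10.8, 48.9),
  newspaper = c(69.2, 45.1, 69.3, 58.5, 58.4, 75.0),
  sales     = c(22.1, 10.4, 9.3, 18.5, 12.9, 7.2)
)

# Design matrix: a column of ones (intercept) followed by the covariates.
X <- cbind(intercept = 1,
           as.matrix(ads_head[, c("TV", "radio", "newspaper")]))
Y <- ads_head$sales

# Solve the normal equations (X^t X) alpha = X^t Y.
alpha_hat <- solve(t(X) %*% X, t(X) %*% Y)
alpha_hat

# Reference: the same model fitted with lm.
coef(lm(sales ~ TV + radio + newspaper, data = ads_head))
```

Calling solve with two arguments solves the linear system without forming \((X^tX)^{-1}\) explicitly, which is usually the numerically safer choice; lm itself relies on a QR decomposition rather than on this formula.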